Proposed Examiner's Amendment 

1. (Previously presented) A method performed by a server device, of identifying 
whether a sequence of terms is a semantic unit, the method comprising: 

receiving, by a communication interface or an input device of the server device, 
the sequence of terms in a memory; 

calculating, by a processor of the server device, a first value representing a 
coherence of terms in the sequence; 

calculating, by the processor, a second value representing variation of context in 
which the sequence occurs; 

comparing, by the processor, the first value to a first threshold and the second 
value to a second threshold; 

identifying, by the processor, that the sequence is a semantic unit based at least in 
part on the first value satisfying the first threshold and the second value satisfying the 
second threshold; and 

outputting, by the communication interface or an output device of the server 
device, an indication that the sequence is a semantic unit based on identifying that the 
sequence is a semantic unit. 
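
The comparing and identifying steps of claim 1 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name and parameters are hypothetical, and "satisfying" a threshold is assumed here to mean "greater than or equal to", which the claim itself does not specify.

```python
def is_semantic_unit(first_value, second_value, first_threshold, second_threshold):
    """Return True when both values satisfy their thresholds.

    Assumption: "satisfying" means >=. The claim leaves the comparison
    direction open, so this is only one possible reading.
    """
    coherence_ok = first_value >= first_threshold
    variation_ok = second_value >= second_threshold
    return coherence_ok and variation_ok
```

For instance, a sequence with a high coherence value but a context variation below the second threshold would not be identified as a semantic unit under this reading.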

2. (Previously presented) The method of claim 1, where the coherence of the 
terms in the sequence is calculated relative to a collection of documents. 

3. (Previously presented) The method of claim 2, where the coherence of the 
terms in the sequence is calculated as a likelihood ratio that defines a probability of the 
sequence occurring in the collection of documents relative to parts of the sequence 
occurring. 

4. (Currently amended) The method of claim 2, where the coherence of the 
terms in the sequence is calculated as: 

LR(A,B) = L(f(B),N) / ( L(f(AB),f(A)) · L(f(~AB),f(~A)) ), 

where f(A) defines a number of occurrences of term A in the collection of documents, 
f(~A) defines a number of occurrences of a term other than term A in the collection of 
documents, f(B) defines a number of occurrences of term B in the collection of 
documents, N defines a total number of events in the collection of documents, f(AB) 
defines a number of times term A is followed by term B in the collection of documents, 
and f(~AB) is a number of times a term other than A is followed by term B in the 
collection of documents, where 

L(k,n) = (k/n)^k · (1 - k/n)^(n-k), 

where n and k are integers. 
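
A likelihood ratio of this kind can be sketched in log space to avoid numeric underflow. The sketch assumes the ratio has the form LR(A,B) = L(f(B),N) / (L(f(AB),f(A)) · L(f(~AB),f(~A))) with L(k,n) = (k/n)^k · (1 - k/n)^(n-k). It further assumes f(~A) = N - f(A) and f(~AB) = f(B) - f(AB), which the claim does not state. All function and parameter names are hypothetical.

```python
import math


def log_l(k, n):
    """log of L(k, n) = (k/n)^k * (1 - k/n)^(n - k), treating 0 * log(0) as 0."""
    if n == 0:
        return 0.0
    p = k / n
    total = 0.0
    if k > 0:
        total += k * math.log(p)
    if n - k > 0:
        total += (n - k) * math.log(1.0 - p)
    return total


def log_likelihood_ratio(f_a, f_b, f_ab, n_events):
    """log LR(A, B) under the assumptions stated in the lead-in."""
    f_not_a = n_events - f_a      # assumption: f(~A) = N - f(A)
    f_not_a_b = f_b - f_ab        # assumption: f(~AB) = f(B) - f(AB)
    return (log_l(f_b, n_events)
            - log_l(f_ab, f_a)
            - log_l(f_not_a_b, f_not_a))
```

Under these assumptions the log ratio is zero when B follows A exactly as often as chance predicts and becomes more negative as the association between A and B strengthens.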



5. (Canceled) 

6. (Previously presented) The method of claim 1, where the first threshold is 
defined as: f(AB) > f(A)·f(B)/N, where f(A) defines a number of occurrences of term A 
in the collection of documents, f(B) defines a number of occurrences of term B in the 
collection of documents, N defines a total number of events in the collection of 
documents, and f(AB) defines a number of times term A is followed by term B in the 
collection of documents. 

7. (Previously presented) The method of claim 1, where the variation of 
context in which the sequence occurs is calculated relative to a collection of documents. 

8. (Previously presented) The method of claim 7, where the variation of 
context in which the sequence occurs is calculated as a measure of entropy of the context 
of the sequence. 

9. (Currently amended) The method of claim [[7]] 8, where the [[variation]] 
measure of entropy of the context [[in which]] of the sequence [[occurs]], H(S), is 
calculated as 

[[HM(S) = MIN(HL(S),HR(S))]] 

H(S) = MIN(HL(S),HR(S)), 

HL(S) = -Σ_w (f(wS)/f(S)) · log(f(wS)/f(S)), 

and 

HR(S) = -Σ_w (f(Sw)/f(S)) · log(f(Sw)/f(S)), 

where MIN defines a minimum operation, S represents the sequence, HL(S) represents 
an entropy to the left of the sequence S, HR(S) represents an entropy to the right of the 
sequence S, f(wS) defines a number of times a particular term, w, appears in the collection 
of documents followed by the sequence S, f(Sw) refers to a number of times the sequence 
S is followed by w in the collection of documents, and f(S) refers to a number of times the 
sequence S is present in the collection of documents. 
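
The entropy measure of claim 9 can be illustrated with a short sketch. It assumes the neighbor counts f(wS) and f(Sw) have already been gathered into dictionaries keyed by the term w, and it uses the natural logarithm because the claim does not name a base. The names `side_entropy` and `context_entropy` are hypothetical.

```python
import math


def side_entropy(neighbor_counts, f_s):
    """Entropy of one side's context: -sum over w of p * log(p), p = count / f(S)."""
    total = 0.0
    for count in neighbor_counts.values():
        if count > 0:  # terms that never adjoin S contribute nothing
            p = count / f_s
            total -= p * math.log(p)
    return total


def context_entropy(left_counts, right_counts, f_s):
    """H(S) = MIN(HL(S), HR(S))."""
    return min(side_entropy(left_counts, f_s), side_entropy(right_counts, f_s))
```

Taking the minimum means a sequence that is always followed by the same term scores zero even if its left context is varied, which is consistent with treating such a sequence as part of a longer unit.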

10. (Currently amended) The method of claim 7, where the variation of 
context in which the sequence occurs, HM(S), is calculated as 

HM(S) = MIN(HLM(S),HRM(S)), 

where MIN defines a minimum operation, S represents the sequence, HLM(S) is defined 
as a minimum of 1 - f(wS)/f(S) for each term w in the collection of documents, HRM(S) is 
defined as a minimum of 1 - f(Sw)/f(S) for each term w in the collection of documents, 
f(wS) defines a number of times a particular term, w, appears in the collection of 
documents followed by the sequence, f(Sw) refers to a number of times the sequence is 
followed by w in the collection of documents, and f(S) refers to a number of times the 
sequence S is present in the collection of documents. 
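
The measure HM(S) of claim 10 can be sketched as below, assuming each side is the minimum over terms w of 1 - f(wS)/f(S) (left) or 1 - f(Sw)/f(S) (right). Only observed neighbors are stored; a term that never adjoins S would contribute the value 1, so it cannot lower the minimum. Function names are hypothetical.

```python
def side_min_complement(neighbor_counts, f_s):
    """Minimum over terms w of 1 - count(w) / f(S)."""
    if not neighbor_counts:
        return 1.0  # every term has count 0, so every value is 1
    return min(1.0 - count / f_s for count in neighbor_counts.values())


def hm(left_counts, right_counts, f_s):
    """HM(S) = MIN(HLM(S), HRM(S))."""
    return min(side_min_complement(left_counts, f_s),
               side_min_complement(right_counts, f_s))
```

The result is small when a single neighbor dominates one side of the sequence, so it behaves like a cheaper stand-in for the entropy measure.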

11. (Currently amended) The method of claim 7, where the variation of 
context in which the sequence occurs, HC(S), is calculated as 

HC(S) = MIN(HLC(S),HRC(S)), 
where MIN defines a minimum operation, S represents the sequence, HLC(S) is defined 
as Σ_w δ(wS) and HRC(S) is defined as Σ_w δ(Sw), where δ(X) is defined as one if a 
sequence X occurs in the collection of documents and zero otherwise, where wS refers to 
a particular word w followed by the sequence S, and where Sw refers to the sequence S 
followed by [[a]] the word w. 
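
Because the indicator is one for each word that occurs next to S at least once, HLC(S) and HRC(S) amount to counts of distinct left and right neighbors. A minimal sketch, assuming the observed neighbors are supplied as iterables of words (the function name is hypothetical):

```python
def hc(left_neighbors, right_neighbors):
    """HC(S) = MIN(HLC(S), HRC(S)), each side counting distinct neighboring words."""
    hlc = len(set(left_neighbors))   # number of distinct words w with wS observed
    hrc = len(set(right_neighbors))  # number of distinct words w with Sw observed
    return min(hlc, hrc)
```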

12. (Currently amended) The method of claim 7, where the variation of 
context in which the sequence occurs, HP(S), is calculated as 

HP(S) = MIN(HLP(S),HRP(S)) 
where MIN defines a minimum operation, S represents the sequence, HLP(S) is defined 
as the number of continuations to the left of the sequence that cover a predetermined 
percentage of all cases in the collection of documents and HRP(S) is defined as the 
number of continuations to the right of the sequence that cover the predetermined 
percentage of all cases in the collection of documents. 
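
One possible reading of claim 12 is sketched below. It assumes that "cases" are the occurrences of the sequence with a neighbor on the given side, and that continuations are counted from most frequent to least frequent until the predetermined percentage is reached. Neither detail is stated in the claim, and the names and the default of 75% are hypothetical.

```python
def continuations_to_cover(neighbor_counts, fraction):
    """Smallest number of distinct continuations whose counts reach `fraction` of all cases."""
    counts = sorted(neighbor_counts.values(), reverse=True)  # most frequent first
    needed = fraction * sum(counts)
    covered = 0
    for used, count in enumerate(counts, start=1):
        covered += count
        if covered >= needed:
            return used
    return len(counts)


def hp(left_counts, right_counts, fraction=0.75):
    """HP(S) = MIN(HLP(S), HRP(S))."""
    return min(continuations_to_cover(left_counts, fraction),
               continuations_to_cover(right_counts, fraction))
```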

13. (Canceled) 



14. (Previously presented) The method of claim 1, where the sequence 
includes three or more words. 

15. (Previously presented) The method of claim 1, further including: 
applying one or more rules to the sequence, and 

where identifying that the sequence is a semantic unit is further based at least in 
part on the application of the one or more rules. 

16. (Previously presented) A device comprising: 
a memory to store instructions; and 

a processor to execute the instructions to implement: 

a receiving component to receive a sequence of terms; 

a coherence component to calculate a coherence of multiple terms in the 
sequence of terms; 

a variation component to calculate a variation of context terms in a 
collection of documents in which the sequence occurs, where the variation of context 
terms is calculated as a measure of entropy of the context of the sequence; and 

a decision component to determine whether the sequence constitutes a 
semantic unit based at least in part on results of the coherence component and the 
variation component, and output an indication of whether the sequence constitutes a 
semantic unit for use in a processor. 

17. (Previously presented) The device of claim 16, where the context terms 
include terms to the left and right of the sequence. 

18. (Previously presented) The device of claim 16, where the coherence of 
the terms in the sequence is calculated relative to the collection of documents. 

19. (Previously presented) The device of claim 18, where the coherence of 
the terms in the sequence is calculated as a likelihood ratio that defines a probability of 
the sequence occurring in the collection of documents relative to parts of the sequence 
occurring. 

20. (Canceled) 

21. (Currently amended) The device of claim 16, where [[the variation]] the 
measure of entropy of the context [[in which]] of the sequence [[occurs]], H(S), is 
calculated as 

H(S) = MIN(HL(S),HR(S)), 

HL(S) = -Σ_w (f(wS)/f(S)) · log(f(wS)/f(S)), 

and 

HR(S) = -Σ_w (f(Sw)/f(S)) · log(f(Sw)/f(S)), 

where MIN defines a minimum operation, S represents the sequence, HL(S) represents an 
entropy to the left of the sequence S, HR(S) represents an entropy to the right of the 
sequence S, f(wS) defines a number of times a particular term, w, appears in the collection 
of documents followed by the sequence, f(Sw) refers to a number of times the sequence S 
is followed by w in the collection of documents, and f(S) refers to a number of times the 
sequence S is present in the collection of documents. 

22. (Currently amended) The device of claim 16, where the variation of 
context in which the sequence occurs, HM(S), is calculated as 

[[HM(S) = MAX(HLM(S),HRM(S))]] 

HM(S) = MIN(HLM(S),HRM(S)), 
where MIN defines a minimum operation, S represents the sequence, HLM(S) is defined 
as a minimum of 1 - f(wS)/f(S) for each term w in the collection of documents, HRM(S) is 
defined as a minimum of 1 - f(Sw)/f(S) for each term w in the collection of documents, 
f(wS) defines a number of times a particular term, w, appears in the collection of 
documents followed by the sequence S, f(Sw) refers to a number of times the sequence S 
is followed by w in the collection of documents, and f(S) refers to a number of times the 
sequence S is present in the collection of documents. 



23. (Currently amended) The device of claim 16, where the variation of 
context in which the sequence occurs, HC(S), is calculated as 

HC(S) = MIN(HLC(S),HRC(S)), 
where MIN defines a minimum operation, S represents the sequence, HLC(S) is defined 
as Σ_w δ(wS) and HRC(S) is defined as Σ_w δ(Sw), where δ(X) is defined as one if 
sequence X occurs in the document collection and zero otherwise, where wS refers to a 
particular word w followed by the sequence S, and where Sw refers to the sequence S 
followed by [[a]] the word w. 

24. (Currently amended) The device of claim 16, where the variation of 
context in which the sequence occurs, HP(S), is calculated as 

HP(S) = MIN(HLP(S),HRP(S)) 
where MIN defines a minimum operation, S represents the sequence, HLP(S) is defined 
as the number of continuations to the left of the sequence that cover a predetermined 
percentage of all cases in the collection of documents and HRP(S) is defined as the 
number of continuations to the right of the sequence that cover the predetermined 
percentage of all cases in the collection of documents. 

25. (Previously presented) The device of claim 16, where the decision 
component is further to compare the results of the coherence component and the variation 
component to threshold values and identify the sequence as a semantic unit based at least 
in part on the comparisons. 

26. (Previously presented) The device of claim 16, where the processor 
further executes the instructions to implement: 

a heuristics component to apply one or more predefined rules to the sequence, 
where the decision component is further to determine whether the sequence constitutes a 
semantic unit based at least in part on application of the one or more rules. 

27. (Previously presented) The device of claim 26, where the one or more 
rules are exclusionary rules that determine when certain sequences are not semantic units. 
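
The claims do not say what the exclusionary rules are. The sketch below only illustrates how a heuristics component might feed the decision: the stop-word list and both rules are hypothetical examples, not rules taken from the claims.

```python
# Hypothetical stop-word list, used only for illustration.
STOP_WORDS = frozenset({"the", "of", "and", "a"})


def starts_with_stop_word(terms):
    """Hypothetical exclusionary rule: the sequence begins with a stop word."""
    return bool(terms) and terms[0].lower() in STOP_WORDS


def ends_with_stop_word(terms):
    """Hypothetical exclusionary rule: the sequence ends with a stop word."""
    return bool(terms) and terms[-1].lower() in STOP_WORDS


EXCLUSION_RULES = (starts_with_stop_word, ends_with_stop_word)


def passes_rules(terms, rules=EXCLUSION_RULES):
    """True when no exclusionary rule marks the sequence as a non-unit."""
    return not any(rule(terms) for rule in rules)
```

A decision component could then require `passes_rules(terms)` in addition to the coherence and variation checks.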

28. (Previously presented) A device comprising: 
a memory to store instructions; and 

a processor to execute the instructions to implement: 
means for receiving a sequence of terms; 

means for calculating a first value representing a coherence of terms in the 
sequence of terms; 

means for calculating a second value representing variation of context in 
which the sequence occurs; 

means for comparing the first value to a first threshold and the second 
value to a second threshold; 

means for identifying that the sequence is a semantic unit based at least in 
part on the first value satisfying the first threshold and second value satisfying the second 
threshold; and 

means for outputting an indication that the sequence is a semantic unit 
based on identifying that the sequence is a semantic unit. 

29. (Previously presented) A computer-readable memory device that includes 
programming instructions to control at least one processor, the computer-readable 
memory device comprising: 

instructions for calculating a first value representing a coherence of terms in a 
sequence of terms; 

instructions for calculating a second value representing variation of context in 
which the sequence occurs, where the variation of context in which the sequence occurs 
is calculated as a measure of entropy of the context of the sequence; 

instructions for identifying that the sequence is a semantic unit based on the first 
and second values; and 

instructions for outputting an indication that the sequence is a semantic unit. 

30. (Currently amended) The computer-readable memory device of claim 29, 
where the coherence of the terms in the sequence is calculated relative to a collection of 
documents. 

31. (Previously presented) The computer-readable memory device of claim 
30, where the coherence of the terms in the sequence is calculated as a likelihood ratio 
that defines a probability of the sequence occurring in the collection of documents 
relative to parts of the sequence occurring. 

32. (Currently amended) The computer-readable memory device of claim 30, 
where the coherence of the terms in the sequence is calculated as: 

LR(A,B) = L(f(B),N) / ( L(f(AB),f(A)) · L(f(~AB),f(~A)) ), 

where f(A) defines a number of occurrences of term A in the collection of documents, 
f(~A) defines a number of occurrences of a term other than term A in the collection of 
documents, f(B) defines a number of occurrences of term B in the collection of 
documents, N defines a total number of events in the collection of documents, f(AB) 
defines a number of times term A is followed by term B in the collection of documents, 
and f(~AB) is a number of times a term other than A is followed by term B in the 
collection of documents, where 

L(k,n) = (k/n)^k · (1 - k/n)^(n-k), 

where n and k are integers. 

33. (Previously presented) The computer-readable memory device of claim 
29, where the coherence of the terms in the sequence is defined as not being sufficient 
unless a threshold is met. 

34. (Previously presented) The computer-readable memory device of claim 
33, where the threshold is defined as: f(AB) > f(A)·f(B)/N, where f(A) defines a 
number of occurrences of term A in the collection of documents, f(B) defines a number of 
occurrences of term B in the collection of documents, N defines a total number of events 
in the collection of documents, and f(AB) defines a number of times term A is followed 
by term B in the collection of documents. 
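
The threshold f(AB) > f(A)·f(B)/N asks whether term A is followed by term B more often than would be expected if the two terms occurred independently. A minimal sketch with a hypothetical function name; it multiplies both sides by N so the comparison stays in exact integer arithmetic:

```python
def exceeds_chance(f_a, f_b, f_ab, n_events):
    """True when f(AB) > f(A) * f(B) / N, compared without division."""
    return f_ab * n_events > f_a * f_b
```

For example, with f(A) = 10, f(B) = 20 and N = 100, the expected count is 2, so the threshold is met only when f(AB) is at least 3.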

35. (Previously presented) The computer-readable memory device of claim 
29, where the variation of context in which the sequence occurs is calculated relative to a 
collection of documents. 

36. (Canceled) 

37. (Currently amended) The computer-readable memory device of claim 35, 
where [[the variation]] the measure of entropy of the context [[in which]] of the sequence 
[[occurs]], H(S), is calculated as 

[[HM(S) = MIN(HL(S),HR(S))]] 
H(S) = MIN(HL(S),HR(S)), 

HL(S) = -Σ_w (f(wS)/f(S)) · log(f(wS)/f(S)), 

and 

HR(S) = -Σ_w (f(Sw)/f(S)) · log(f(Sw)/f(S)), 

where MIN defines a minimum operation, S represents the sequence, HL(S) represents an 
entropy to the left of the sequence S, HR(S) represents an entropy to the right of the 
sequence S, f(wS) defines a number of times a particular term, w, appears in the 
collection of documents followed by the sequence, f(Sw) refers to a number of times the 
sequence is followed by w in the collection of documents, and f(S) refers to a number of 
times the sequence S is present in the collection of documents. 

38. (Currently amended) The computer-readable memory device of claim 35, 
where the variation of context in which the sequence occurs, HM(S), is calculated as 

HM(S) = MIN(HLM(S),HRM(S)), 
where MIN defines a minimum operation, S represents the sequence, HLM(S) is defined 
as a minimum of 1 - f(wS)/f(S) for each term w in the collection of documents, HRM(S) is 
defined as a minimum of 1 - f(Sw)/f(S) for each term w in the collection of documents, 
f(wS) defines a number of times a particular term, w, appears in the collection of 
documents followed by the sequence, f(Sw) refers to a number of times the sequence is 
followed by w in the collection of documents, and f(S) refers to a number of times the 
sequence is present in the collection of documents. 



39. (Currently amended) The computer-readable memory device of claim 35, 
where the variation of context in which the sequence occurs, HC(S), is calculated as 

HC(S) = MIN(HLC(S),HRC(S)), 
where MIN defines a minimum operation, S represents the sequence, HLC(S) is defined 
as Σ_w δ(wS) and HRC(S) is defined as Σ_w δ(Sw), where δ(X) is defined as one if 
sequence X occurs in the collection of documents and zero otherwise, where wS refers to 
a particular word w followed by the sequence S, and where Sw refers to the sequence S 
followed by [[a]] the word w. 

40. (Currently amended) The computer-readable memory device of claim 35, 
where the variation of context in which the sequence occurs, HP(S), is calculated as 

HP(S) = MIN(HLP(S),HRP(S)) 
where MIN defines a minimum operation, S represents the sequence, HLP(S) is defined 
as the number of continuations to the left of the sequence that cover a predetermined 
percentage of all cases in the collection of documents and HRP(S) is defined as the 
number of continuations to the right of the sequence that cover the predetermined 
percentage of all cases in the collection of documents. 

41. (Previously presented) The computer-readable memory device of claim 
29, where the instructions for identifying that the sequence is a semantic unit include 
instructions for comparing the first and second values to first and second thresholds and 
identifying the sequence as a semantic unit when the first and second values satisfy the 
first and second thresholds. 

42. (Previously presented) The computer-readable memory device of claim 
29, where the sequence includes three or more words. 

43. (Previously presented) The computer-readable memory device of claim 
29, further including: 

instructions for applying one or more rules to the sequence, and 
where the instructions for identifying that the sequence is a semantic unit are 
further based at least in part on the application of the one or more rules. 

44. (Previously presented) The system of claim 28, where the coherence of 
the terms in the sequence is calculated as a likelihood ratio that defines a probability of 
the sequence occurring in the collection of documents relative to parts of the sequence 
occurring. 

45. (Previously presented) The system of claim 28, where the variation of context in 
which the sequence occurs is calculated as a measure of entropy of the context of the 
sequence. 



